Our team is pleased to submit the following analysis in response to DDSAnalytics' request for support in identifying factors that contribute to attrition, along with any other trends associated with job roles in the workforce represented by the provided data set. In conducting the analysis, we employed a number of exploratory data analysis techniques and are confident that we have uncovered significant insights in response to these requests.
All of the files, code, and presentation materials used in support of this submission are available to DDSAnalytics at the following GitHub repository: https://github.com/Ujustwaite/DDS_Case_Study_2
Being unfamiliar with the DDSAnalytics workforce or structure, we began our analysis with a robust examination of the operating construct, workforce composition, job roles, departments and other aspects contained within the data.
The data we were provided contained records for 1,470 current and former employees of DDSAnalytics. These records broke down as follows.
We observed three departments within the organization. The Human Resources (HR) department was by far the smallest and Research and Development (R&D) was the largest, with Sales in between.
| Department | Employee Count | % of Total Employees |
|---|---|---|
| HR | 63 | 4.3% |
| R&D | 961 | 65.4% |
| Sales | 446 | 30.3% |
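The percentages in the department table follow directly from the headcounts. As a minimal sketch (the variable and function names are our own, chosen for illustration), the shares can be recomputed like this:

```python
# Headcounts copied from the department table above.
department_counts = {"HR": 63, "R&D": 961, "Sales": 446}

# The three departments together cover all 1,470 records.
total_employees = sum(department_counts.values())


def share_of_total(count, total):
    """Return count as a percentage of total, rounded to one decimal place."""
    return round(100 * count / total, 1)


department_share = {
    name: share_of_total(count, total_employees)
    for name, count in department_counts.items()
}

for name, pct in department_share.items():
    print(f"{name}: {pct}%")
```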
The age distribution of the population represented by the data is approximately normal, as we might expect. In general, we do not see any concerning trends here with regard to the age of the workforce. In other words, there is no massive wave of retirees impending, nor are there any indications of a junior-biased workforce that would indicate a lack of professional experience. We examine this further, but at this point the data appear to be representative of the national population as a whole.
| AgeGroup | Employee Count | Male Employees | Male Employee % |
|---|---|---|---|
| <20 | 17 | 9 | 52.9% |
| 20-29 | 309 | 197 | 63.8% |
| 30-39 | 622 | 371 | 59.6% |
| 40-49 | 349 | 205 | 58.7% |
| 50-59 | 168 | 97 | 57.7% |
| 60+ | 5 | 3 | 60.0% |
Nine (9) unique job roles were identified within the data. Sales executives, research scientists, and laboratory technicians account for the greatest number of positions. From the table below we can see that there is a gender imbalance in the workforce. The overall breakdown is 60% male and 40% female, but that distribution varies considerably by job role. Human resources representatives, for example, are predominantly male (69.2%). Female concerns may not be sufficiently represented in the HR department, and consideration should be given to adding female HR representatives. Efforts could also be made to improve female representation in general, because only a small number of roles are near a 50-50 distribution.
| Department | Job Role | Employee Count | Male Employees | Male Employee % |
|---|---|---|---|---|
| HR | Human Resources | 52 | 36 | 69.2% |
| HR | Manager | 11 | 7 | 63.6% |
| R&D | Healthcare Representative | 131 | 80 | 61.1% |
| R&D | Laboratory Technician | 259 | 174 | 67.2% |
| R&D | Manager | 54 | 30 | 55.6% |
| R&D | Manufacturing Director | 145 | 73 | 50.3% |
| R&D | Research Director | 80 | 47 | 58.8% |
| R&D | Research Scientist | 292 | 178 | 61.0% |
| Sales | Manager | 37 | 18 | 48.6% |
| Sales | Sales Executive | 326 | 194 | 59.5% |
| Sales | Sales Representative | 83 | 45 | 54.2% |
One of the primary requests is an analysis of the factors that contribute to employee attrition. That analysis is below.
When exploring attrition by age, we can see that the most significant attrition rates are in the youngest portion of the workforce. The youngest group, <20, is reasonably expected to have high attrition as those employees are likely interns or transitioning to college / other careers.
Employees in the 20-29 age group, though, also have a significant attrition rate of 26.2%. The business could target this group as a potential “high risk of attrition” population for retention incentives or engagement programs.
| Age Group | Employee Count | Attrition Count | Attrition % per Group |
|---|---|---|---|
| <20 | 17 | 10 | 58.8% |
| 20-29 | 309 | 81 | 26.2% |
| 30-39 | 622 | 89 | 14.3% |
| 40-49 | 349 | 34 | 9.7% |
| 50-59 | 168 | 23 | 13.7% |
| 60+ | 5 | 0 | 0.0% |
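The per-group rates above are simply attrition count divided by employee count. The sketch below recomputes them from the table and, as an illustrative extra that is not part of the original analysis, flags the groups whose rate exceeds the overall rate implied by the same counts. All names are our own.

```python
# (age group, employee count, attrition count) copied from the table above.
age_groups = [
    ("<20", 17, 10),
    ("20-29", 309, 81),
    ("30-39", 622, 89),
    ("40-49", 349, 34),
    ("50-59", 168, 23),
    ("60+", 5, 0),
]


def attrition_rate(employees, leavers):
    """Attrition as a percentage of the group, rounded to one decimal place."""
    return round(100 * leavers / employees, 1)


rates = {label: attrition_rate(n, left) for label, n, left in age_groups}

# Overall rate implied by the table: total leavers over total employees.
total_employees = sum(n for _, n, _ in age_groups)
total_leavers = sum(left for _, _, left in age_groups)
overall_rate = attrition_rate(total_employees, total_leavers)

# Groups whose attrition rate is above the overall rate.
above_average = [label for label, rate in rates.items() if rate > overall_rate]

print(rates)
print("overall:", overall_rate, "above average:", above_average)
```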
When looking at job roles, we can see that attrition is particularly high in three of them: Human Resources (23.1%), Laboratory Technician (23.9%), and Sales Representative (39.8%).
Beyond looking at attrition, we can see trends in the data regarding the average monthly income of employees and the average number of years that employees in each job role have been working.
Managers have the longest average working years, closely followed by Research Directors. This is generally because these positions are the more senior roles in an organization. It is unclear whether this is desired by DDSAnalytics or not.
| Department | Job Role | Employee Count | % of Total Employees | Attrition Count | Attrition Rate per Role | Over Time % | Avg. Working Years | Avg. Mo. Income |
|---|---|---|---|---|---|---|---|---|
| HR | Human Resources | 52 | 3.5% | 12 | 23.1% | 25.0% | 8.2 | 4,236 |
| HR | Manager | 11 | 0.7% | 0 | 0.0% | 36.4% | 27.5 | 18,089 |
| R&D | Healthcare Representative | 131 | 8.9% | 9 | 6.9% | 28.2% | 14.1 | 7,529 |
| R&D | Laboratory Technician | 259 | 17.6% | 62 | 23.9% | 23.9% | 7.7 | 3,237 |
| R&D | Manager | 54 | 3.7% | 3 | 5.6% | 24.1% | 23.2 | 17,130 |
| R&D | Manufacturing Director | 145 | 9.9% | 10 | 6.9% | 26.9% | 12.8 | 7,295 |
| R&D | Research Director | 80 | 5.4% | 2 | 2.5% | 28.8% | 21.4 | 16,034 |
| R&D | Research Scientist | 292 | 19.9% | 47 | 16.1% | 33.2% | 7.7 | 3,240 |
| Sales | Manager | 37 | 2.5% | 2 | 5.4% | 27.0% | 25.6 | 16,987 |
| Sales | Sales Executive | 326 | 22.2% | 57 | 17.5% | 28.8% | 11.1 | 6,924 |
| Sales | Sales Representative | 83 | 5.6% | 33 | 39.8% | 28.9% | 4.7 | 2,626 |
| Department | Job Role | Employee Count | % of Total Employees | Attrition Count | Attrition Rate per Role | Over Time % | Avg. Working Years | Avg. Mo. Income |
|---|---|---|---|---|---|---|---|---|
| Sales | Sales Representative | 40 | 12.3% | 21 | 52.5% | 30.0% | 2.5 | 2,401 |
| HR | Human Resources | 13 | 4.0% | 6 | 46.2% | 23.1% | 4.5 | 3,070 |
| R&D | Laboratory Technician | 85 | 26.1% | 31 | 36.5% | 29.4% | 4.6 | 2,965 |
| R&D | Research Scientist | 95 | 29.1% | 21 | 22.1% | 30.5% | 4.4 | 2,639 |
| Sales | Sales Executive | 56 | 17.2% | 9 | 16.1% | 30.4% | 6.7 | 5,743 |
| R&D | Healthcare Representative | 14 | 4.3% | 2 | 14.3% | 28.6% | 7.9 | 6,406 |
| R&D | Manufacturing Director | 20 | 6.1% | 1 | 5.0% | 15.0% | 7.0 | 5,644 |
| Department | Job Role | Employee Count | % of Total Employees | Attrition Count | Attrition Rate per Role | Over Time % | Avg. Working Years | Avg. Mo. Income |
|---|---|---|---|---|---|---|---|---|
| Sales | Sales Representative | 39 | 3.6% | 10 | 25.6% | 30.8% | 7.2 | 2,907 |
| R&D | Laboratory Technician | 161 | 14.9% | 30 | 18.6% | 22.4% | 9.2 | 3,378 |
| Sales | Sales Executive | 254 | 23.4% | 46 | 18.1% | 29.5% | 12.2 | 7,201 |
| HR | Human Resources | 36 | 3.3% | 5 | 13.9% | 25.0% | 9.5 | 4,715 |
| R&D | Research Scientist | 183 | 16.9% | 23 | 12.6% | 35.0% | 9.5 | 3,570 |
| R&D | Manufacturing Director | 119 | 11.0% | 9 | 7.6% | 28.6% | 13.9 | 7,556 |
| R&D | Healthcare Representative | 115 | 10.6% | 7 | 6.1% | 28.7% | 14.9 | 7,714 |
| R&D | Manager | 53 | 4.9% | 3 | 5.7% | 22.6% | 23.5 | 17,229 |
| Sales | Manager | 37 | 3.4% | 2 | 5.4% | 27.0% | 25.6 | 16,987 |
| R&D | Research Director | 76 | 7.0% | 2 | 2.6% | 27.6% | 22.0 | 16,189 |
| HR | Manager | 11 | 1.0% | 0 | 0.0% | 36.4% | 27.5 | 18,089 |
In addition to the job role trends and contributions to attrition identified above, there are a number of contributing factors that we refer to as “Quality of Life” factors. These characteristics affect an employee’s personal life, reflect time spent at the office, or represent their overall satisfaction with the work that they are doing. Business Travel, for example, is driven by work requirements, but can negatively affect an employee’s ability to spend time with their family or friends. Let’s take a look at how these factors contribute:
To aid DDSAnalytics in its efforts to identify employees at risk of attrition and to mitigate that risk, we have built a model that can, based on the factors provided, achieve some level of accuracy in determining whether an employee is an attrition risk. A positive prediction does not inherently mean that an employee will leave the company. It does, however, allow DDSAnalytics to understand the specific factors that contribute to attrition, and possibly to target those employees with accommodations that may aid in their retention on the team.
The model we have selected performs a logistic regression on a number of features and predicts either “1” (the employee is likely to attrite) or “0” (the employee is not at risk of attrition).
Because we were given only a single set of data, we divided it into a “training” set consisting of two-thirds of the original data and a “test” set consisting of the remaining one-third. The test set was used to validate the model and to compute the precision, recall, and accuracy statistics that are provided below as outputs.
The model takes into account all of the parameters provided, both numeric and categorical, and determines a “best fit” to give the best possible prediction. The fitted model identifies the following as the most important features, listed in rank order with their relative importance.
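To illustrate how a logistic regression turns feature values into a “1” or “0” prediction, here is a minimal sketch. The weights, intercept, and the 0.5 decision threshold are hypothetical values chosen for illustration; they are not the coefficients of the fitted model.

```python
import math


def sigmoid(z):
    """Map a real-valued score to a probability between 0 and 1."""
    return 1.0 / (1.0 + math.exp(-z))


def predict_attrition(features, weights, intercept, threshold=0.5):
    """Return (probability, label) for one employee.

    label is 1 (likely to attrite) when the probability reaches the
    threshold, otherwise 0. The threshold of 0.5 is an assumption.
    """
    score = intercept + sum(weights[name] * value for name, value in features.items())
    probability = sigmoid(score)
    label = 1 if probability >= threshold else 0
    return probability, label


# Hypothetical coefficients, for illustration only.
example_weights = {"OverTimeYes": 1.5, "YearsAtCompany": -0.1}
example_intercept = -1.0

# An employee who works overtime and has two years at the company.
p_risky, label_risky = predict_attrition(
    {"OverTimeYes": 1, "YearsAtCompany": 2}, example_weights, example_intercept
)

# An employee with no overtime and ten years at the company.
p_stable, label_stable = predict_attrition(
    {"OverTimeYes": 0, "YearsAtCompany": 10}, example_weights, example_intercept
)

print(round(p_risky, 3), label_risky)
print(round(p_stable, 3), label_stable)
```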
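The split described above can be sketched as a simple shuffled partition. This is an illustration under our own assumptions (a plain random shuffle with a fixed seed, and stand-in record IDs); the partitioning method actually used in the analysis may differ, for example by stratifying on the attrition label, so the exact set sizes may not match.

```python
import random


def train_test_split(records, train_fraction=2 / 3, seed=42):
    """Shuffle records reproducibly and split them into (train, test)."""
    shuffled = list(records)
    random.Random(seed).shuffle(shuffled)
    cutoff = round(len(shuffled) * train_fraction)
    return shuffled[:cutoff], shuffled[cutoff:]


# Stand-in for the 1,470 employee records: one integer ID per record.
employee_ids = list(range(1470))
train_ids, test_ids = train_test_split(employee_ids)

print(len(train_ids), len(test_ids))
```

Fixing the seed keeps the partition reproducible, so that metrics computed on the test set can be compared between runs.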
| Variable | Importance |
|---|---|
| OverTimeYes | 7.0359912 |
| NumCompaniesWorked | 3.5820345 |
| EnvironmentSatisfaction | 3.4643447 |
| JobRoleSales Representative | 3.3911405 |
| DistanceFromHome | 3.1655068 |
| YearsSinceLastPromotion | 3.1521373 |
| BusinessTravelTravel_Frequently | 3.1368534 |
| JobRoleSales Executive | 3.1352773 |
| JobRoleLaboratory Technician | 3.0916779 |
| RelationshipSatisfaction | 2.9050143 |
| JobInvolvement | 2.8267995 |
| YearsInCurrentRole | 2.5346660 |
| JobRoleHuman Resources | 2.4166655 |
| Age | 2.2628159 |
| BusinessTravelTravel_Rarely | 2.2578541 |
| YearsWithCurrManager | 2.2300423 |
| WorkLifeBalance | 2.1918375 |
| GenderMale | 2.0433676 |
| YearsAtCompany | 1.8496805 |
| StockOptionLevel | 1.7002759 |
| MaritalStatusSingle | 1.6795583 |
| JobRoleResearch Scientist | 1.6093804 |
| JobRoleResearch Director | 1.2464367 |
| JobRoleManufacturing Director | 1.1174693 |
| MaritalStatusMarried | 0.6253382 |
| JobRoleManager | 0.0341333 |
| x |
|---|
| 0.9569892 |
| x |
|---|
| 0.978022 |
Accuracy – “How many times does the model get it right?”
Accuracy = 0.8742268
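For reference, precision, recall, and accuracy are all derived from the four cells of a confusion matrix. The sketch below shows the standard definitions; the counts in the example are hypothetical and are not the model's actual test results.

```python
def classification_metrics(tp, fp, fn, tn):
    """Compute precision, recall and accuracy from confusion-matrix counts.

    tp: true positives, fp: false positives,
    fn: false negatives, tn: true negatives.
    """
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    accuracy = (tp + tn) / (tp + fp + fn + tn)
    return {"precision": precision, "recall": recall, "accuracy": accuracy}


# Hypothetical counts, for illustration only.
metrics = classification_metrics(tp=40, fp=10, fn=20, tn=130)

for name, value in metrics.items():
    print(f"{name}: {value:.3f}")
```

Precision answers “of the employees flagged, how many were flagged correctly?”, while recall answers “of the employees who should have been flagged, how many were?”.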
As you can see, on the train/test data, our model performs quite well. We are excited to try it on “real-world” data soon!
The “Top 3” contributors according to our model are:
An employee who works overtime
The number of companies an employee has worked for
An employee’s Environment Satisfaction score
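The top contributors can be read off the importance table by sorting it in descending order. A minimal sketch, using only the first five rows of that table (the dictionary name is our own):

```python
# First five rows of the variable-importance table.
importance = {
    "OverTimeYes": 7.0359912,
    "NumCompaniesWorked": 3.5820345,
    "EnvironmentSatisfaction": 3.4643447,
    "JobRoleSales Representative": 3.3911405,
    "DistanceFromHome": 3.1655068,
}

# Sort variable names by importance, largest first, and keep the top three.
top_three = sorted(importance, key=importance.get, reverse=True)[:3]

print(top_three)
```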
As you can see, there are a number of trends in the data and a number of factors that contribute to an employee’s possible attrition. How best to use this data is up to the management at DDSAnalytics. However, we hope that, having read this analysis, you are well positioned to identify employees at risk, identify new trends in the data, and respond to these trends accordingly.